0{N) continuous electrostatics solvation energies calculation method for biomolecules 

simulations. 
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We report a development of a new fast surface-based method for numerical calculations of solvation 
energy of biomolecules with a large number of charged groups. The procedure scales linearly with the 
system size both in time and memory requirements, is only a few percents wrong for any molecular 
configurations of arbitrary sizes, gives explicit value for the reaction field potential at any point, 
provides both the solvation energy and its derivatives suitable for Molecular Dynamics simulations. 
The method works well both for large and small molecules and thus gives stable energy differences 
for quantities such as solvation energies of molecular complex formation. 



Solvent interactions play an essential role in Nature 
in determining electrostatic potential energies of molecu- 
lar conformations, charge states of proteins, dissociation 
constants of small molecules, and binding properties of 
protein-ligand complexes. A solvation energy calculation 
for a molecule-sized object has always been and still is a 
challenging problem. The most accurate approach is, ap- 
parently, a large scale MD simulation [1, 2\ of the body 
of interest immersed in a tank of water molecules in a 
realistic force field or even within quantum mechanical 
settings. Though such an approach may in principle pro- 
vide ultimately accurate predictions, the calculations are 
time consuming and pose a number of specific problems 
stemming, e.g. from long relaxation times of water clus- 
ters. One possible way to bridge the "simulation gap" is 
to employ different types of continuous solvation models. 
Fortunately, water is characterized by a very large value 
of dielectric constant and therefore the reaction field of 
water molecules is collective in nature. Although realis- 
tic properties of molecular interactions depend both on 
short-scale water molecules alignment and on the long- 
range dipole-dipole interactions at the same time [3", "4], 
purely electrostatic models, such as Poisson-Boltzmann 
equation solvers [5, 6j, turned out to be very successful 
in various applications. 

Even within the realm of continuous electrostatic mod- 
els there are numerous approaches in use to calculate the 
polar part of the solvation energies. Popular techniques 
span from finite element methods (FEM, [3 H H |3 H 
[Tot [TTJ [12]) to various types of Generalized Born (GB) 
approximations [ll[ll[I3[ll[T71[Tl[Ta|20l[2^ A nu- 
merical FEM solution to the Poisson-Boltzmann equation 
(PBE) is a formally fast (the calculation time and mem- 
ory scale (X with N being the number of particles in 
the system) and is a rigorous attempt to solve the elec- 
trostatics problem. On the other hand FEM involve a 
good numerical overhead and in practice GB approxima- 
tions are faster, in spite of the fact that it normally takes 
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0{N'^) operations to calculate GB energy. Unfortunately 
GB approximations are very rough and that is why GB 
calculations work well only for small and medium sized 
molecules, whereas FEM methods can, although at ex- 
pense of a numerical complexity, be applied to very large 
systems. The particular boundary between the applica- 
bility of the two methods depends on the balance of speed 
the amount of details and accuracy required in a specific 
application. 

In this Letter we push forward our recently established 
connection [22] between the Generalized Born (GB) mod- 
els [HI [151 [T71 EH [23 [2l|25] and the boundary inte- 
gral formulation of the electrostatics problem [6j. We 
show that the GB solvation energy can in fact be cal- 
culated in linear time and memory for arbitrary system 
of charges. Using post-Coulomb Field Approximation 
(post-CFA) for the Born radii calculations we report a 
development of a new fast surface-charges density based 
method (Surface Charges GB, SCGB) for numerical so- 
lution of the Poisson-Boltzmann equations and use it for 
the solvation energy calculations for biomolecules with a 
large number of charged groups. The procedure turns out 
only a few percents wrong for realistic molecular config- 
urations, gives explicit value for the reaction field poten- 
tial at any point within the system and provides both the 
solvation energy and its derivatives suitable for Molecu- 
lar Dynamics (MD) simulations. The method works well 
both for large and small molecules and thus gives stable 
energy differences for quantities such as solvation ener- 
gies of a molecular complex formation. 

Modern implicit water methods trade accuracy and 
physical sophistication for speed and usually are based 
on assumptions [T71 [25] traceable back to the original 
approach of Born [26]. Consider a molecule modeled as 
a system of charges confined within a water cavity as 
shown on Fig|l] The shape of the cavity can be either 
obtained by displacing the water out of all the atomic 
volumes and then collecting the atomic volumes into the 
molecular volume [ III [24l [27] • An alternative can be the 
molecular volume separated from the water bulk by a suf- 
ficiently smooth interface surface Tw containing all the 
atoms and having no unphysical water-filled caverns in- 
side [28l[23. Neither of approaches is ideal, though the 



2 




* * 

p 



Figure 1: Schematic representation of a charged biomolecule 
in a continuous water model. 

surface based methods often produce better molecular 
volumes. The polar contribution to the solvation energy 
(the solvation energy) of the molecule is given by 

1 ^ 

Es = ^^q^Mr^). (1) 

i=l 

where (pi is the so called reaction field potential produced 
by the water polarization charges as explained in e.g. [6j. 
Here the Latin indices i = 1...N enumerate the charges, 
N is the total number of charges, qi and are the charges 
and the positions of the ions. 

The actual calculation of the reaction field potential 
depends on further assumptions and may be performed 
in a number of ways. Since the dielectric constant in 
water is large {ew 80 1), the electric potential on 
the molecular surface vanishes to a very good accuracy: 
(p{r) |r^= (the so called "ideal conductor" approxima- 
tion). Therefore the polarization charges are confined to 
the interface and the reaction field can be approximated 
as 

Here as {r') is the surface density of the polarization 
charges and df is the molecular surface element. The 
total electric potential at a given point r is (/?(r) = 
(fo{^) + ^i(r), where 

N 



is the Coulomb potential generated by the molecular 
charges. The surface charge density a satisfies the in- 



tegral equation 

Jfw |r-r'| ^ |r-ri| 

(3) 

If the molecular surface is properly discreticized then 
both the polarization charge density a and the solvation 
energy can be obtained iteratively in 0{NlnN) opera- 
tions with the help of either FFT or fast multipole meth- 
ods for fast matrix-vector products and proper precon- 
ditioners [30l [SH IH [33l [335 . In practice the number of 
iterations required for full convergence is far from a few 
and the whole calculation is nevertheless fairly compu- 
tationally demanding. Another problem arises from the 
fact that applications such as MD simulations or mini- 
mizations require derivatives with respect to the atoms 
coordinates. Naturally, finding a derivative of an iter- 
atively obtained solution is not an easy task. That is 
why a substantial effort was put in finding reasonable 
approximate solutions to Eq. (|3| as described in the re- 
cent publications [35^, ^36\ and the refs. therein. 

Historically Generalized Born (GB) methods provide 
an apparently different way of the solvation energy cal- 
culations [Ql [II [13 [II [13 [la [la llQl [21] . In our recent 
work [22j we established the link between the surface elec- 
trostatics and GB models. It turns out that GB models 
can be also used to provide the reaction field potential 
approximation within the molecule and to calculate the 
polarization charge density. To do that we reintroduce 
GB models following our presentation in [22j using the 
simplest Kirkwood-like form of the reaction field poten- 
tial [371 138J 

^(r)=^o + ^.-|:..(^-^), (4) 

where 

S^ir) = y'(r-r,-f +i?(r,)i?(r), 

and R{t) is a properly chosen function. Specific expres- 
sions for the function R are different in different mod- 
els and are expressed either in terms of either volume 
[Il[ll[l5l[ia[I71[l8l[Ii[20l[21] or surface integrals 
da [21 [211 [39]. The values of the function R{t) at the 
positions of the charges are called the respective Born 
radii, Ri = R{ri) of the ions. The surface and the vol- 
ume integral formulation dichotomy of GB models has a 
long history and the models defined with a help of prop- 
erly chosen molecular surfaces (see e.g. [28", "29]) have a 
good number of practical advantages [40j. Normally the 
polar part of the solvation energy is obtained by plug- 
ging the expression from the Eq. Q for the reaction 
field potential into the Eq. ^ [22, 37, 38j: 
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where /(r^j) = Sj{ri), ep ~ 1 is the dielectric constant 
of the molecular interior. The expression implies dou- 
ble summation over the molecular charges and requires 
0{N'^) operations to compute. 

One of the simplest way to calculate the Born radii 
comes from the so called Coulomb field approximation 
(CFA) [ll[35l|3i[4ll'42]: the electric displacement vec- 
tor in a nonuniform medium is taken as that in the vac- 
uum. The CFA is wrong for the ions next to the protein 
boundary [T71 |24l |25l [S^, which is a problem indeed, 
since most of the charges within typical biomolecules are 
located next to the molecular surfaces. There are a few 
ways to go beyond the CFA and obtain a more accurate 
approximation. The first class of the models was intro- 
duces in a number of works [lIl|2l[2l[25l[38l[lQlll3[lll 
in either of the equivalent volume or surface integral for- 
mulations 
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df\ (6) 



where Si = |s^|, = — r^, and /3 = 5 — 7 is a (varia- 
tional) parameter. The integration over the water bulk 
W in the middle of (|6| is transformed to the equivalent 
boundary integral form in a standard way with the help 
of the Gauss theorem [43j. Another important model is 
given by 



df 



(7) 



where Ca is the properly chosen constant. The special 
choice of = 6 in the model (|6| and the two models 
described by Eq. ([7f with a = 3,4 and Ca = l/47r are 
exact for an arbitrary system of charges within a spherical 
"molecule" [22]. Though the specific choice of the Born 
radii method is not important for the following consid- 
erations, we naturally prefer these "inherently accurate" 
models and call them SCGB (Eq. (|6| with P = 6) and 
SCGB(3) or SCGB(4) (Eq. with a = 3,4). 

To obtain a faster method we suggest to use the model 
potential Q to calculate the polarization charge density 
on the molecular surface a from the electrostatic bound- 
ary condition in a standard way 



a = 



1 dip 
47r dn ' 



(8) 



Next to the molecular surface (r' Tw) the functions R 
in each of our models vanishes, R {r') ^ 2h ^ 0, where 
h is the distance from a given point to the surface [22j . 
Therefore 



<T(r') 



1 



47r ^ |r' 



Rj 



(9) 



Once the surface charge density is known, we can use 
Eq. ^ to obtain the solvation energy in the same way 
as if cr is a solution of the surface electrostatics. In what 



follows we call the method Surface Charges Generalized 
Born (SCGB). 

Let us summarize the solvation energy calculation al- 
gorithm in a few lines: 

1. given a set of molecular charges Qi located at the 
positions and a useful discretization of the sur- 
face, representing the molecule-water interface, we 
calculate first the set of Born radii with the help 
of the surface integration according to either of Eq. 
^ and Eq. ([7| with properly chosen values of a 
or p. 

2. as soon as the Born radii are ready, we calculate 
the surface charge density at every point on the 
molecular surface according to Eq.(|9|. 

3. now when the surface charge density is known, we 
can calculate the solvation energy using the exact 
expressions ([T]) and 

Although the apparent computational complexity of the 
outlined procedures is 0{M x N) (or better say in 
0{M xNlog{M xN))), where M is the number of points 
in the discrete representation of the molecular surfaces, 
the discrete summation involves only the coordinates dif- 
ferences and thus the calculation can be performed in 
0{M-\-N) operations with the help of either FFT or fast 
multipole methods. 

SCGB approximation is by no means exact, A(f ^ 0, 
and hence there can be (and in fact there are) superfi- 
cial polarization charges in the water bulk and within the 
molecule. Let us perform a few simple model calculation 
to see how accurate the suggested SCGB procedure can 
be. Consider first a charge placed somewhere at the po- 
sition within a sphere of a radius a. Then, a simple 
calculation yields 



(10) 



and the model expression for the electrostatic potential 
coincide with the exact result e.g. from [45j 



1 



3 I J 



(11) 



with fj = Vj/vj. In the same way the surface charge 
density calculated from this expression for the potential 
according to Eq.Q coincides with that given by Eq. (|9|: 



47ra |r' 



Since as = crj is an additive quantity, SCGB approx- 
imations gives the exact result for as for an arbitrary 
charge distribution within a sphere. An interesting case 
corresponds to a sphere with a = oo, that is a very large 
molecule occupying a half-space. 
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SCGB models for a layer 
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Figure 2: Solvation energies to exact solvation energy ratios 
comparison for a charge placed within a dielectric layer of 
thickness L. The upper, middle, and lower curves describe 
the solvation energy for SCGB, SCGB{3,4), correspond- 
ingly. All the quantities approach the exact value on the 
molecule boundaries {z = 0,L). 



SCGB approach can not, of course, be exact for an 
arbitrary molecule geometry. Consider another practi- 
cally important example: a plain layer-like "molecule" (or 
membrane) of the thickness L surrounded by the contin- 
uous water on both sides with a charge q placed inside 
the layer at the distance z from one of the water interface 
planes. The exact result for the solvation energy of the 
system is [46, 47J 



(Es). 



Jo 



dk 



smh{kz) smh{k {1 — z)) 1 
sinh {k) 2 



(12) 

where z = z/L. The results for the layer-like molecule in 
all the three SCGB approaches are: 



1 - 2z(l 



SCGB 



L 4:z{l-z) ^l-3z{l-z)' 



l-2z(l-z) 
~T 4z{l-z) ' 

g Vl-2z(l-z) 
L Az{l-z) ' 



We compared them with the exact result of Eq. ([12| on 
Figure [2] All the quantities approach the exact value on 
the molecule boundaries {z = 0, L) and differ from the 
exact solution in the middle of the layer. 

Another challenging case is the calculation for a single 
charge q placed within a wedge made of the two perpen- 
dicular infinite walls (the "x2:" and "?/2;" planes). The 
SCGB results are 



(Es) 



q^ (2 — sin cos ip) (sin + cos (p) 



SCGB 



> sm ip cos (p) 



(sin ip cos (p) 



Table I: SCGB calculations examples (all the methods) for a 
point on a cylinder axis {R is the radius of the cylinder) and 
in the center of a cube of the size a. 



iEs)4 



q^ TT (2 — sin (p cos (p) (sin (p + cos p)) 

r 8 sin (/:) cos (p [(tt — (p) cos (| +(/:?) sin (^] ' 



q^ Y^(2 — sin (p cos (p) (sin (p + cos (p) 



sm p) cos (p 



where p) is the azimuthal angle between the position of a 
charge and the xz-plane, r is the distance separating the 
charge from the wedge. The results should be compared 
with the exact solvation energy 



{Es), 



q^ sm p? + cos p? — simp cos p? 
r 4 sin cp cos p? 



The error can be analyzed by observing the ratio Qa = 
(Es)^ I (Es)^^, which is the largest at = 7r/4 and 

QscGB = 1.36, Qs = 0.91, g4 = 1.13. 

The measures of the error are reasonable though of course 
not perfect. To build more confidence we have also per- 
formed the calculations for a charge placed on the axis of 
an infinite cylinder of the radius R and in the center of 
a cube of the size a (see Table |l| with roughly the same 
results. 

All the calculations presented in this Section so far 
may be fair but concern only a few oversimplified exam- 
ples produced for model systems with idealized geome- 
tries. To judge on the actual performance of the method 
we turn to a practically interesting realistic system: sol- 
vation energy calculations for 7V8-neuraminidase protein 
(pdb accession code 2htl). The molecule is composed of 
387 amino acids and, after all the hydrogen atoms added, 
has 5866 atoms. The results of the calculations are rep- 
resented on Fig. [3) The horizontal axis represents the 
Born radii taken from "exact" solvation energy Es using 
the "definition" 

Rb = -q^/2Es. 

The quantity Es was found "exactly" by solving FEM 
version of Eq. ^ . The vertical axis shows the Born radii 
obtained by our SCGB(3) model with the help of Q with 
a = 3. The results of the calculations agree pretty well 
in the small Born radii region and diverge in the protein 
center (large Born radii region). This behavior is well 
expected, since it is exactly the center of a large molecule 
which is the region where the divergence between the 
SCGB and exact solvation energy is the most (compare 
e.g. with Fig. |2]). 
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Surface charges GB radii vs. exact 



Water polarization charge in SC3B models 
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Figure 5: Water polarization charge compensation (overall 
neutrality) demonstration SCGB calculations. 



Figure 3: SCGB (3) Born radii for the atoms of 2/it7 protein 
vs. the "exact" values obtained using a calculation based on 
surface FEM method. 

SCGB models vs. FEM 
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Figure 4: 580 proteins, SCGB solvation vs. the surface FEM 
method. 

The single-charge calculations represented on Fig. |3] 
is an interesting but not an ultimate test of the model. 
What counts in realistic approximations is of course the 
solvation energy of a large molecule with a complicated 
shape of the molecular surface and a sophisticated atomic 
charge distribution. We applied all the four SCGB mod- 
els to a set of 580 proteins from the Quantum Pharma- 
ceuticals Binding library and presented the results (the 
data on the vertical axis) in correlation with the solvation 
energy obtained with a surface FEM method (the data on 
the horizontal axis) on Figure [ij The blue dots show the 
performance of SCGB with the Born radii obtained with 
the standard CFA formulas. The CFA-based method fails 
pretty miserably, whereas all the other SCGB are in good 
agreement with the "exact" FEM calculations. Although 
all three post-CFA SCGB methods are nearly all as good 
as each other, the green dots representing the SCGB{A) 
model give a somewhat better approximation. 



The reason behind the distinction of the SCGB {A) 
model may stem from the "inherent" absolute neutrality 
of the system of the protein and the water polarization 
charges in the model. Indeed, Eq. j9| for the surface 
charges density combined with Eq. ([7f with a = 3 give 
ensure overall neutrality of the system 

Qs= f dfcjs(v') = -Y,(lj (13) 

for arbitrary surface geometry and the charges distri- 
butions. This does not tell of course that the other 
two SCGB models are much worse, the abilities of the 
methods to recover correct solvation energies are approx- 
imately the same. 

Few concluding remarks should be placed here. Obvi- 
ously SCGB is not able to provide exact solutions to the 
electrostatics problem. In fact in many of the practical 
applications this may well not be an issue: genuine water 
environment is neither continuous or describable in terms 
of simple electrostatics. SCGB is clearly computationally 
superior to classic GB implementations both in speed and 
accuracy since the calculations can be done in 0{N) in- 
stead of 0{N'^). In fact, the real comparison should be 
made to iterative surface electrostatic solvers, which can 
also be made 0(7V)— fast. The advantage comes from the 
fact that SCGB solution can be obtained in a number of 
steps roughly equal to the number of operations required 
for a single iteration of surface based FEM electrostatics 
solver. Another advantage of SCGB stems from availabil- 
ity of numerical derivatives for any surface implementa- 
tion with surface areas and normals. 

The authors are indebted to Quantum Pharmaceuti- 
cals for support. The solvation energy contribution in- 
troduced this report is implemented in a number of Quan- 
tum Pharmaceuticals models and employed in Quan- 
tum's drug discovery applications. PCT application is 
filed. 
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